For individual molecules, quantum mechanics (QM) offers a simple, natural, and elegant way to build
large-scale complex networks: quantized energy levels are the nodes, allowed transitions among the levels are
the links, and transition intensities supply the weights. QM networks are intrinsic properties of molecules
and they are characterized experimentally via spectroscopy; thus, realizations of QM networks are called
spectroscopic networks (SN). As demonstrated for the rovibrational states of H₂¹⁶O, the molecule governing
the greenhouse effect on Earth through hundreds of millions of its spectroscopic transitions (links), both the
measured and first-principles computed one-photon absorption SNs containing experimentally accessible
transitions appear to have heavy-tailed degree distributions. The proposed novel view of high-resolution
spectroscopy and the observed degree distributions have important implications: appearance of a core of
highly interconnected hubs among the nodes, a generally disassortative connection preference, considerable
robustness and error tolerance, and an "ultra-small-world" property. The network-theoretical view of
spectroscopy offers a data reduction facility via a minimum-weight spanning tree approach, which can help
high-resolution spectroscopists improve the efficiency of the assignment of their measured spectra.

High-resolution molecular spectroscopy is one of the high-end analytical tools that can be used to obtain
detailed chemical information about complex natural systems. These systems include the Earth's atmosphere,
where spectroscopy helps to understand the greenhouse effect, and astronomical bodies of our
universe, where spectroscopy helps, among other things, to answer principal questions concerning life on Earth.
The extensive spectroscopic data required by related modelling efforts have been consolidated into information
systems 1-11. The data deposited in these information systems traditionally come from a large number of
high-resolution experimental investigations. Experiments are usually done by different groups employing different
techniques in different regions of the spectrum, resulting in a broad range of data accuracy. The relative accuracy
of transition frequencies detected in the lab ranges from 10⁻⁵ to 10⁻¹⁰, while for transition intensities it is only
10⁻². As to theory, in the fourth age of quantum chemistry 12 it is possible to determine accurate high-resolution
spectroscopic data and spectra 13,14. To satisfy the demand of modellers, nearly complete first-principles
linelists have been computed for a number of small molecules 15. These lists contain from thousands to millions
of entries in the form of rotational-vibrational-electronic energies and transitions and their most important
characteristics (e.g., quantum numbers, symmetries, and intensities).

Although high-resolution spectroscopic experiments yield highly accurate data, at the same time these data are
highly incomplete. For example, the 5 000 experimental eigenenergies reported by Mellau 16-18 are complete up to
7 000 cm⁻¹ above the HCN ground state, yet they cover only 98 vibrational states. The 25 000 rovibrational states
determined in these high-resolution infrared emission studies correspond only to 1 5% of the vibrational states up
to isomerization. When compared with experimental data, ab initio linelists show the following important
characteristics: while the relative accuracy of the ab initio energy levels is 10 to 10 000 times worse than that
of typical experimental data, most of the transition intensities have accuracies similar to experimental data. The
striking disparity between the accuracy and the number of first-principles computed and experimentally measured
energy levels and transitions, together with the fact that in many cases ab initio intensities may be used
directly for high-resolution analyses, leads to the conclusion that for the foreseeable future one should combine
experimental and ab initio information to satisfy the needs of modellers, who often require nearly
complete high-resolution (line-by-line) spectroscopic data 19. In turn, this conclusion leads immediately to the
questions of how the results of the various experiments should be viewed, how experimental and theoretical data
could be unified, how ab initio data may be used to simplify the assignment of measured spectra, and how to build
the most dependable information systems containing line-by-line spectroscopic data.

We believe that to obtain the best answers to these questions one should consider the energy levels and the
spectroscopic transitions of a molecule from the point of view of graph theory. Thus, earlier we introduced the
concept of spectroscopic networks (SN) 20-24, where quantized energy levels are the nodes (vertices) and
allowed transitions among the levels are the links (edges) of a graph (see Fig. 1). SNs are considered
to be an intrinsic property of molecular systems, though characteristics of SNs can be slightly different
depending on how we actually probe these systems experimentally (e.g., in absorption or in emission). SNs
also provide a convenient representation of the experimental and theoretical data and ways for their most
advantageous unification.

Figure 1 | Visual representation of the first-principles spectroscopic networks of H₂¹⁶O in absorption with an
intensity cut-off of 10⁻²⁰, 10⁻²², and 10⁻²⁴ cm molecule⁻¹, from left to right, with clearly visible ortho and
para components and buildup of hubs.

In this paper we extend the network-theoretical analysis of SNs and, furthermore, develop novel tools for
high-resolution spectroscopy research based on the concept of SNs. We use H₂¹⁶O as the
model system of our present investigation. The SN of the H₂¹⁶O
molecule is chosen for several reasons. Water is the most abundant
polyatomic molecule in the Universe. It is present in many different
environments and at many different temperatures. Detailed characterization
of the spectroscopic properties of this triatomic molecule
is needed to understand and predict the greenhouse effect on Earth,
and its spectroscopy is of high astrophysical and astrochemical relevance.
Furthermore, H₂¹⁶O was the subject of a large number of
experimental high-resolution spectroscopic studies validated
recently 25. This experimental dataset of H₂¹⁶O, one of the spectroscopically
most thoroughly studied molecules, contains 14 319 nodes
(energy levels) and 97 868 unique links (transitions) 25. A high-quality
first-principles linelist 26, including energy levels, assignments, transitions,
and Einstein A coefficients, is also available for H₂¹⁶O. This
computed, so-called BT2 linelist contains altogether 221 097 nodes
and 505 806 255 links. Based on the number of nodes and links and
the underlying structure one can conclude that even this simple
triatomic molecule corresponds to a very complex system if the
allowed one-photon transitions among its quantized energy levels
are considered.

Spectroscopic networks 

A graph G, corresponding to an SN of a molecule, say H₂¹⁶O, is an
ordered pair, G = (L, T), where L is the set of energy levels (vertices)
and T is a set of transitions (edges), the edges being 2-element subsets
of L (see Fig. 1). The number of transitions that emanate from an
energy level is called the degree of the level. SNs do not contain loops
and, since different experiments may measure the same transitions,
SNs corresponding to experiments are in fact multigraphs. First-principles
SNs are, on the other hand, simple graphs. SNs contain
a large number of cycles of widely differing size. In SNs non-negative
transition intensities, different for different experimental techniques,
are assigned to edges as weights. In summary, SNs are large, finite,
weighted, and rooted graphs.
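
The definition above maps directly onto an adjacency structure. The following minimal Python sketch stores an SN as an undirected weighted multigraph. The function names (`build_sn`, `degree`) and the toy level labels and intensities are illustrative assumptions, not data or code from this study.

```python
from collections import defaultdict


def build_sn(transitions):
    """Build an undirected weighted multigraph from (level_a, level_b, intensity) records.

    Parallel edges are kept, so the same transition reported by two
    experiments appears twice, as in a measured SN.
    """
    adjacency = defaultdict(list)
    for level_a, level_b, intensity in transitions:
        if level_a == level_b:
            raise ValueError("SNs do not contain loops")
        if intensity < 0:
            raise ValueError("transition intensities are non-negative")
        adjacency[level_a].append((level_b, intensity))
        adjacency[level_b].append((level_a, intensity))
    return dict(adjacency)


def degree(adjacency, level):
    """Number of transitions that emanate from an energy level."""
    return len(adjacency.get(level, []))


# Hypothetical toy data: four levels, one transition measured twice.
toy_transitions = [
    ("a", "b", 3.0e-21),
    ("a", "b", 2.9e-21),  # same line from a second (hypothetical) experiment
    ("a", "c", 1.0e-23),
    ("c", "d", 4.0e-25),
]
toy_sn = build_sn(toy_transitions)
```

In this toy network level "a" has degree 3 because the doubly measured line counts twice. A first-principles SN would simply never contain such parallel edges.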

Construction of a first-principles SN goes through the following
steps: (1) take all (available) energy levels for the given molecule as
nodes; (2) use the quantum chemical selection rules appropriate for
the molecule and the experiment to link the nodes; and (3) add the
intensities as weights to the links based on the type of experiment and
the chosen temperature. The number of links in the graph built is
naturally much smaller than the number of all possible links between the nodes.
Consequently, the corresponding adjacency matrix is extremely
sparse. In the particular case of H₂¹⁶O, consideration of nuclear spins
results in two distinct connection schemes. In the language of graph
theory these are components of a network. The two principal components
(PC) correspond to the two nuclear spin isomers (usually
called "ortho" and "para") of H₂¹⁶O and both have unique roots.
Selection rules cause the two PCs of the SN of H₂¹⁶O to be bipartite
graphs. This interesting fact explains why only even-numbered
cycles exist in the SN of H₂¹⁶O and of molecules of a similar nature 27.
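
Both structural statements made here, that the network splits into components and that each component is bipartite, can be checked on any edge list by a breadth-first two-colouring. The sketch below is a generic illustration under that assumption; the function names and the toy "ortho"/"para" labels are hypothetical and no real selection rules are encoded.

```python
from collections import deque


def adjacency_from_edges(edges):
    """Undirected simple graph as {node: set(neighbours)}."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency


def two_colour_components(adjacency):
    """Return (components, is_bipartite) using breadth-first two-colouring."""
    colour = {}
    components = []
    bipartite = True
    for start in adjacency:
        if start in colour:
            continue
        colour[start] = 0
        members = [start]
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nb in adjacency[node]:
                if nb not in colour:
                    colour[nb] = 1 - colour[node]
                    members.append(nb)
                    queue.append(nb)
                elif colour[nb] == colour[node]:
                    # Two neighbours share a colour: an odd cycle exists.
                    bipartite = False
        components.append(members)
    return components, bipartite


# Hypothetical toy network: a 4-cycle ("ortho") and a 3-node path ("para").
toy = adjacency_from_edges([
    ("o1", "o2"), ("o2", "o3"), ("o3", "o4"), ("o4", "o1"),
    ("p1", "p2"), ("p2", "p3"),
])
components, bipartite = two_colour_components(toy)
```

A graph that passes this test contains no odd cycles, which is the property invoked above for the principal components.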

Measurements map only a very limited part of an SN and yield a
graph called Aₘ. The intensity of the transitions is responsible for the
incompleteness of Aₘ, as below a certain intensity it is impossible to
detect a transition in a given type of experiment. Using the intensity
as a cut-off parameter, a series of model networks can be constructed
from the complete SN built upon the BT2 linelist 26. We used the
following cut-off parameters to construct model networks for the
examination of the evolution of one-photon absorption SNs: 10⁻²⁰,
10⁻²², 10⁻²⁴, 10⁻²⁶, and 10⁻²⁸ cm molecule⁻¹ (see Fig. 1 for a visual
representation of three of the first-principles model SNs and Table 1
for details about these SNs, including the number of nodes and links
they possess). To emphasize that these SNs belong to absorption, the
corresponding graphs are called A₂₀ to A₂₈.

Figure 2 | Distribution of links among nodes given as log-log size-frequency [log k vs. log P(k)] plots for the
measured (Aₘ, left panel) and a first-principles (A₂₈, right panel) spectroscopic network of one-photon
absorption transitions for H₂¹⁶O.
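
The cut-off construction of the model networks amounts to filtering a linelist by intensity. A minimal sketch follows, assuming a linelist held as (upper, lower, intensity) tuples; the function name `apply_cutoff`, the inclusive comparison, and the toy intensities are assumptions made for illustration.

```python
def apply_cutoff(transitions, cutoff):
    """Keep transitions with intensity >= cutoff (cm molecule^-1).

    Returns the surviving nodes and links; whether the comparison should be
    inclusive is an assumption of this sketch.
    """
    links = [(up, lo, s) for up, lo, s in transitions if s >= cutoff]
    nodes = {level for up, lo, _ in links for level in (up, lo)}
    return nodes, links


# Hypothetical toy linelist spanning several orders of magnitude.
toy_linelist = [
    ("a", "b", 1e-19),
    ("a", "c", 5e-21),
    ("b", "d", 3e-23),
    ("c", "d", 2e-25),
    ("d", "e", 7e-27),
]

# One model network per cut-off, mimicking the series used in the text.
series = {exponent: apply_cutoff(toy_linelist, 10.0 ** -exponent)
          for exponent in (20, 22, 24, 26, 28)}
```

Lowering the cut-off can only add nodes and links, so the model networks are nested.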

Floating components (FC), those which do not connect to the
roots of PCs, arise frequently in measurements. Since no known
transitions exist between the two PCs of the rovibrational SN of
H₂¹⁶O, the absolute energy of the higher-energy root, set to a relative
energy of zero by definition, can be determined only from an outside
source, hindering the high-accuracy absolute determination of all
measured energy levels. Artificial transition energies connecting
roots of SNs may be called "magic numbers". The traditional route
to obtain them is provided by highly accurate model Hamiltonians. A
network-theoretical possibility is to take advantage of omnipresent
degeneracies of certain higher-energy rovibrational levels in the two
PCs, which can be identified straightforwardly by fourth-age 12 variational
nuclear-motion computations. These degeneracies are able to
connect the distinct components via zero-energy artificial transitions.
This was done in Ref. 25 for H₂¹⁶O and in Ref. 28 for D₂¹⁶O,
with the comforting result that the network-theoretical and model
Hamiltonian approaches yield the same magic number.

Degree distributions 

For many observables there is a typical mean value they cluster 
around. As to SNs, where the number of experimentally measured 
links is about an order of magnitude larger than the number of 
nodes 25,27,29-32 , the question is whether there is a mean value for the 
number of transitions that an "average" energy level has. To answer 
this question one needs to investigate the distribution of the links 
among the nodes. 

Fig. 2 depicts the size-frequency [log k vs. log P(k)] plots for the Aₘ
and A₂₈ SNs of H₂¹⁶O. One can find a very broad distribution and,
apart from the very low and very high k part, a reasonably linear
relationship in both cases. As detailed in the Methods section, an
elaborate search has been performed to estimate the form of the
underlying discrete degree-distribution functions of these and the
other model SNs. The search included a power-law form, P(k) ∝ k^(-γ),
where γ is the scaling index, as well as exponential and log-normal
forms. The analyses indicate a definitely heavy-tailed and,
after constraining k to the middle range, a power-law-like behavior
with a scaling index of about 2 (Table 2, vide infra). As found for
many complex networks 33-35, it is not possible to distinguish between
the power-law and the log-normal distributions, but the exponential
distribution is definitely not compatible with the data. The observed
heavy-tailed distribution is one of the most important overall characteristics
of SNs and it seems to be generally valid for the PCs of
SNs 23.
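
The size-frequency plot discussed above is built from the empirical degree distribution P(k), the fraction of nodes with exactly k links. A minimal sketch of that bookkeeping is given here; the function names and the toy degree list are illustrative assumptions.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Empirical P(k): fraction of nodes that have degree k."""
    counts = Counter(degrees)
    n = len(degrees)
    return {k: count / n for k, count in sorted(counts.items())}


def loglog_points(pk):
    """(log10 k, log10 P(k)) pairs for a size-frequency plot; k = 0 is skipped."""
    return [(math.log10(k), math.log10(p)) for k, p in pk.items() if k > 0]


# Hypothetical degree sequence of six nodes.
toy_pk = degree_distribution([1, 1, 1, 2, 2, 4])
toy_points = loglog_points(toy_pk)
```

A straight line through such points would indicate power-law-like behavior over the plotted range; it does not by itself discriminate between power-law and log-normal forms.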

Whether the degree distribution follows a power law or is simply
top heavy, the degree distribution functions obtained suggest
that SNs are characterized by hubs, i.e., a small number of nodes with
a large number of connections. As expected, the most important hubs
in a room-temperature absorption spectrum are on the ground
vibrational state, (0 0 0), where (v₁ v₂ v₃) are approximate vibrational
quantum numbers corresponding to symmetric stretch, bend, and
antisymmetric stretch, respectively. For Aₘ the hubs are as follows:
J_KaKc = 6₃₄, 5₂₃, and 4₂₃, with 458, 455, and 447 links, respectively 25,
where J_KaKc is the standard rigid-rotor-type quantum number notation
applied for asymmetric top molecules, such as H₂¹⁶O. In the A₂₈
SN the energy levels with the largest number of transitions are
6₃₄ (1487), 5₂₃ (1433), and 6₂₅ (1431), where the number of links is
given in parentheses. Remarkably, the two largest hubs coincide,
showing how extensive the experimental investigations are for
H₂¹⁶O. Note that the most important hub for HD¹⁶O in absorption
is also the (0 0 0)6₃₄ level 23.

To investigate the hubs of SNs further we determined an SN corresponding
to emission created from the first-principles BT2 linelist
with an intensity cut-off of 10⁻²⁰ cm molecule⁻¹ at 1650 K, which
could be called E₂₀. In emission the hubs with the largest number of
connections belong to different vibrational states; they are the
(0 2 0)9₆₃, (0 0 1)6₃₃, and (0 1 0)10₃₈ levels with 102, 101, and 100
links, respectively. The most important hubs in absorption appear to
be important hubs in emission, but the reverse is obviously not true.

Detailed comparison of the connectivity of measured and first-principles
hubs helps to determine the "weakest", least well determined
hubs within Aₘ. This allows the design of new experiments
which help to determine a more accurate and robust experimental
description of the SN with a minimum amount of effort.

One can also ask whether the hubs with the largest
number of links take part in the most intense transitions. The answer
is a clear no. The 6₃₄, 5₂₃, and 4₂₃ pure rotational energy levels take
part in the 16th, 18th, and 13th most intense rovibrational absorption
transitions, respectively. Vice versa, the two energy levels taking part
in the most intense transition are only 69th and 89th in the list of hubs
based on the number of connections.

Complexity measures 

Complexity of a graph G can be assessed by several metrics 35-39 . Three 
of them, C(G), S(G), and r(G) have been investigated in this study 
(see Table 1). 

The local clustering coefficient, C(G) 38, quantifies how close local
graphs are to being a complete graph. This metric cannot be used for
the bipartite PCs of the model SNs of H₂¹⁶O, as bipartite graphs do not
contain odd-numbered cycles such as triangles.

A second metric is the structural metric (s-metric) with the corresponding
S(G) value 39 (see the Methods section for details). The
S(G) values of the different networks investigated are collected in
Table 1.

As shown by Newman 36, social networks seem to show "assortative
mixing", i.e., their high-degree vertices preferentially attach to
other high-degree vertices. On the contrary, technological and biological
networks tend to show 36 "disassortative mixing", i.e., their
high-degree vertices attach to low-degree ones. A graph assortativity
measure is the Pearson correlation coefficient, r(G) 39. The r(G) values
for the first-principles and measured SNs investigated are given in
Table 1. For details see the Methods section.

Ordinarily 36-37, one expects a large value of S(G) to be associated
with a large positive r(G) value. As seen in Table 1, the S(G) and r(G)
values decrease when the intensity cut-off parameter of the first-principles
SNs is decreased. This unusual behavior can be rationalized
once the evolution of the underlying SNs is understood. If we
examine the smallest model SN, A₂₀ (see the leftmost panel of Fig. 1
for its visual representation), we find that it contains only two components
(it would not be surprising if the energy levels involved in the
largest intensity lines produced several components, but this is
not the case here). In these two components, containing the most
intense transitions, the likelihood of connections among high-degree
nodes (hubs) is high; in other words, their eigenvector centrality 37 is
high. This is the reason why the S(G) value is relatively large, while
r(G) is close to zero. While the r(G) value of A₂₀ is negative, the
corresponding large S(G) value indicates that this graph is disassortative
with hubs showing an assortative behavior. This means that in
A₂₀ hubs do like to connect to each other but each hub has many
connections to low-degree nodes. Investigating the other SNs we can
make another interesting and important observation: the nodes characterized
as hubs do not change with the cut-off parameter. Of the
first 100 hubs of the model A₂₀ and A₂₈ SNs 98 are common, meaning
that the hubs already appear in the smallest SN and hubs remain hubs
when the SN is enlarged. When increasing the size of the SN by
decreasing the intensity cut-off parameter, the number of low-degree
nodes increases substantially and the ratio of the connections among
high-degree nodes to that of high-low connections decreases. This is
the reason why the S(G) values show a decreasing tendency when
going from A₂₀ to A₂₈ and the SNs become increasingly disassortative.
Note also how well the experimental SN, Aₘ, fits this picture,
supporting these findings about SNs.

Small worlds 

The small-world and ultra-small-world properties of graph theory
characterize networks where the average path length, defined as the
average length of the shortest paths between two arbitrarily chosen nodes,
scales as ~log N or ~log log N, respectively, where N is the number of
nodes in the network. Scale-free networks are closer to ultra-small
worlds 40. Heuristically this means that most vertices are within reach
via a small number of steps.

The structure resulting from the extreme number of connections
within a particular SN can be described efficiently by two numbers,
the diameter and the average path length. Of the possible definitions
of a diameter we use the one which states that the diameter of a
network, d(G), is the maximal shortest path between any two vertices.
The diameters and the average path lengths of the SNs studied
are given in Table 1. The average path length for the first-principles
and measured SNs of H₂¹⁶O is only about 7; the measured SN has a
slightly larger value. The diameter of the first-principles SNs grows as
the size of the SN grows but remains at relatively small values. As the
data of Table 1 suggest, SNs are ultra-small worlds.
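
Both quantities used in this section follow from breadth-first searches on the unweighted graph. The sketch below computes them exactly for a small graph, assuming an adjacency mapping of the form {node: set(neighbours)}; the function names and the toy path graph are hypothetical. Its cost grows as the number of nodes times the number of links, so for a network of the size of the BT2 SN one would presumably sample source nodes instead.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest path lengths (in links) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nb in adjacency[node]:
            if nb not in dist:
                dist[nb] = dist[node] + 1
                queue.append(nb)
    return dist


def path_statistics(adjacency):
    """Return (average shortest path length, diameter) over connected pairs."""
    total = pairs = diameter = 0
    for source in adjacency:
        for target, d in bfs_distances(adjacency, source).items():
            if target != source:
                total += d
                pairs += 1
                diameter = max(diameter, d)
    average = total / pairs if pairs else 0.0
    return average, diameter


# Hypothetical toy graph: a path of four levels, a-b-c-d.
toy_path = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
average_length, d_G = path_statistics(toy_path)
```

Pairs of nodes in different components are skipped, so for a disconnected graph the numbers describe the components rather than the whole network.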

Network vulnerability 

A spectroscopic network becomes larger either via new measurements
(for an experimental SN) or by a decrease in the intensity
cut-off (for a first-principles SN). In either case, the number of transitions
increases substantially faster than the number of energy levels,
in complete accord with the degree distribution observed. The number
of cycles within the network also increases drastically. As a result,
SNs appear to be extremely robust.

Robustness of SNs can be ascertained by random removal of
nodes 41. In scale-free networks removal of nodes leads to an increase
in the diameter 41. In SNs, after random removal of 10 to 90% of the
nodes, d(G) reflects how the graph fragments and thus provides
useful characteristics about SNs. The original diameter of the largest
first-principles graph investigated, A₂₈, is 34 (Table 1), and this value
does not change until we randomly remove some 95% of the nodes.
Then the diameter suddenly drops to 22. The observed robustness of
the SN of H₂¹⁶O can be explained by the nature of the selection rules
leading to a bipartite graph and the presence of an assortative core of
interconnected hubs. To support the latter we note that in A₂₈ the first
448 hubs, 1% of the nodes, own almost 40% of the links. On the one
hand, the probability of random removal of hubs is small; on the
other hand, if we remove such hubs, another hub "takes over" in the
graph, as hubs are "well connected". The situation is quite different
when we attack the graph, i.e., we remove the high-degree nodes
systematically. If we delete the first 200 hubs, 0.45% of the nodes,
which have 20.45% of the links, the diameter reduces to 18. The
extreme error tolerance is another characteristic property of SNs
and this property is somewhat similar to that observed in other
complex networks.
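
The two experiments contrasted above, random failure and targeted attack on hubs, differ only in how the removed nodes are chosen. A minimal sketch under stated assumptions: the graph is an adjacency mapping {node: set(neighbours)}, the diameter is taken as the longest finite shortest path over all fragments, and all function names and the toy graph are hypothetical.

```python
import random
from collections import deque


def remove_nodes(adjacency, removed):
    """Copy of the graph without the given nodes and their links."""
    removed = set(removed)
    return {node: {nb for nb in nbs if nb not in removed}
            for node, nbs in adjacency.items() if node not in removed}


def diameter(adjacency):
    """Longest finite shortest path, taken over all components."""
    longest = 0
    for source in adjacency:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nb in adjacency[node]:
                if nb not in dist:
                    dist[nb] = dist[node] + 1
                    queue.append(nb)
        longest = max(longest, max(dist.values()))
    return longest


def random_failure(adjacency, fraction, seed=0):
    """Remove a random fraction of the nodes (seeded for reproducibility)."""
    rng = random.Random(seed)
    nodes = sorted(adjacency)
    return remove_nodes(adjacency, rng.sample(nodes, int(fraction * len(nodes))))


def targeted_attack(adjacency, n_hubs):
    """Remove the n_hubs nodes with the largest degree."""
    hubs = sorted(adjacency, key=lambda n: len(adjacency[n]), reverse=True)[:n_hubs]
    return remove_nodes(adjacency, hubs)


# Hypothetical toy graph: two linked hubs (0 and 1), each with three leaves.
toy = {0: {1, 2, 3, 4}, 1: {0, 5, 6, 7},
       2: {0}, 3: {0}, 4: {0}, 5: {1}, 6: {1}, 7: {1}}
```

In the toy graph, deleting a single hub already shortens the longest path because the graph fragments, which mirrors the drop in diameter reported above for the attacked A₂₈ network.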

Data reduction via SNs 

Since high-resolution spectroscopic measurements yield an extreme
amount of information, the reduction of the data to manageable size
is a basic challenge for the theory of spectroscopy. The standard
solution is to use model Hamiltonians with a small number of parameters
and least-squares optimize these parameters to represent all
the measured data 42. In a way this means that spectroscopic transitions
are converted to parameters yielding energy levels. These parameters
allow excellent interpolation but they may fail drastically
when used to extrapolate beyond the measured range.

SNs offer another data reduction facility via an inversion of transitions
to energy levels. For example, the 500 million transitions of the
BT2 linelist can be converted back to about 200 thousand energy
levels. This feature of SNs has been exploited in the MARVEL
(Measured Active Rotational-Vibrational Energy Levels) procedure 21,22
used, among other applications, to derive the IUPAC spectroscopic
database of water isotopologues 25-28,29,31,32.

The best way to reduce the information content of SNs is through
the use of weighted spanning trees. By using weighted spanning
trees 43 (see the Methods section), one can reduce the information
contained in the huge number of measured transitions of the
complex Aₘ network to a relatively small set of energy levels. The
uncertainties of the links of Aₘ differ widely. The network-theoretical
view makes it possible to appreciate how cycles within a component of an SN,
which contain a lot of extra information compared to, for example,
minimum-weight spanning trees, help to fix the energy levels and
tighten their uncertainties.
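
The simplest inversion of transitions to energy levels uses only a spanning tree: starting from the root, each tree link fixes one new level. The sketch below illustrates this idea and nothing more. It is not the MARVEL procedure, it ignores uncertainties and the extra information carried by cycles, and the function name, the record layout (upper, lower, wavenumber), and the toy wavenumbers are assumptions.

```python
from collections import deque


def energies_from_tree(tree, root, root_energy=0.0):
    """Propagate energies along tree links, with E_upper - E_lower = wavenumber."""
    neighbours = {}
    for upper, lower, wavenumber in tree:
        neighbours.setdefault(lower, []).append((upper, +wavenumber))
        neighbours.setdefault(upper, []).append((lower, -wavenumber))
    energy = {root: root_energy}
    queue = deque([root])
    while queue:
        level = queue.popleft()
        for other, delta in neighbours.get(level, []):
            if other not in energy:
                energy[other] = energy[level] + delta
                queue.append(other)
    return energy


# Hypothetical spanning tree of four levels (wavenumbers in cm^-1).
toy_tree = [("b", "a", 10.0), ("c", "b", 5.5), ("c", "d", 2.0)]
toy_energies = energies_from_tree(toy_tree, root="a")
```

Each additional measured transition beyond the tree closes a cycle and therefore provides a consistency check on the energies obtained this way.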

Assignment of spectra 

High-resolution spectroscopy is also a science (and art) of quantum
number assignment of measured lines and levels. The traditional way
of analysing high-resolution experimental spectra is the a priori
assignment of lines with good and approximate quantum numbers
followed by a fitting of the levels via a small number of spectroscopic
parameters of a well-designed model Hamiltonian 42. This type of
assignment procedure fails in the case of highly excited rovibrational
states and in general when the number of rovibrational transitions
exceeds a limit corresponding to an acceptable analysis time. A combined
microwave to visible spectrum of any polyatomic molecule is
converted to a list of labelled eigenenergies 16-18 in a high-resolution
study.

Hereby we advocate a novel protocol for the assignment of
spectra based on SNs: detect the lines in a measured high-resolution
spectrum leading to the largest number of new energy levels
via an investigation of a suitable first-principles SN and assign the
transitions with quantum numbers by mapping the ab initio linelist
onto experimental spectra using graph theory. Taking the
negative logarithm of the intensity of the transitions as the weight
function for the transitions of the SN, the minimum-weight spanning
tree displays the transitions with the largest intensities; thus,
it readily identifies the most intense and thus the practically most
useful spectral features. An illustration of the concept is provided
in Fig. 3.

The proposed method based on graph theory allows the automated
and fast conversion of very large experimental datasets into
complete eigenenergy lists. These lists are the starting points for the
development of theoretical models connecting our physical and
chemical view on molecules 18.

Finally, we create an artificial spectrum in order to show the
utility of the weighted spanning-tree approach. The complete set of
1 916 H₂¹⁶O rovibrational energy levels up to 7 000 cm⁻¹ is known
with high-resolution accuracy from a MARVEL study 25. Based
on these energy levels a simulated room-temperature absorption
spectrum is obtained containing 45 266 allowed transitions with
intensities larger than 10⁻²⁸ cm molecule⁻¹. The corresponding minimum-weight
spanning tree contains 1 914 transitions, the minimum
number of intense transitions needed to convert the spectrum back to
an energy list. This represents a significant, more than 20-fold reduction
in the data. In other words, analysis of only 1 914 intense transitions
yields the maximum number of energy levels that can be
determined from this spectrum. It is worth adding that out of the
45 266 lines 19 482, an order of magnitude more than minimally
needed, have indeed been measured and assigned 25, which is likely an
unusually high degree of completeness.
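
The numbers quoted in this example can be cross-checked with elementary graph arithmetic. A spanning forest of a graph with N nodes and c components has N - c links; taking c = 2 (the ortho and para components) is an assumption of this check, although it is consistent with the quoted tree size.

```latex
\begin{align*}
  |T_{\text{tree}}| &= N - c = 1916 - 2 = 1914, \\
  \frac{45\,266}{1914} &\approx 23.65 \quad \text{(reduction factor, "more than 20-fold")}, \\
  \frac{19\,482}{1914} &\approx 10.18 \quad \text{(measured lines per tree link, "an order of magnitude")}.
\end{align*}
```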

Conclusions 

Driven by the needs of scientific and engineering applications, complex
spectroscopic networks, perhaps as part of active databases 20-24,
are expected to become an intrinsic part of the description of the
high-resolution spectra of molecules. A good opportunity to advance
the field of high-resolution molecular spectroscopy and to turn data
into knowledge, as emphasized in the article defining the fourth age
of quantum chemistry 12 and confirmed here, is offered via the joint
use of accurate experiments, accurate first-principles computations,
and efficient mathematical and numerical algorithms provided by,
for example, graph and database theory.
Methods

An assumption at the beginning of this study was that a power-law distribution
would be the best choice for modeling the degree distribution of SNs 23. The
in-depth analysis of the degree distributions of the SNs studied utilized a review
article 43 and two codes: igraph [igraph is a free software package for creating and
manipulating undirected and directed graphs, see http://igraph.sourceforge.net/]
and an open-source Python package 44. The density function of power-law
distributions can be written as P(k) ~ L(k) k^(-γ). This function is undefined for k = 0;
hence, a suitable k_min value must be defined. This k_min can be specified by various
methods, e.g., choosing a noise threshold value or the minimum value in a given
sample. Often the low end of the dataset, which contains small values
compared to the whole data, does not follow a power-law behavior. Therefore,
one can fit a power-law distribution with each value in the dataset acting as
k_min and select the best fit by minimizing the Kolmogorov-Smirnov (KS)
distance, p(KS), between the empirical data and the fitted model. After
determining the parameters of the power-law distribution, we tested our
hypothesis that the best model for the empirical degree distribution is the
power-law one by implementing a one-sample KS test. We reject the hypothesis
if the p values obtained from the test fall below 0.05. The results are summarized
in Table 2.

The KS test results suggest that the optimal fitting model depends heavily on the
intensity cut-off value used to create the model SN. We observe that A₂₅ is a "sweet
spot" graph in the power-law modelling of the first-principles absorption SN of
H₂¹⁶O. By using lower absorption intensity cut-offs, one can no longer properly fit a
power-law distribution to the dataset.

Note that there are two observations which help to explain the observed 
behavior. First, as we incorporate transitions with smaller intensities the 
network does not expand in terms of new vertices but becomes denser. Second, 
we refer the reader to the section on complexity measures. As seen there, the 
intensities of transitions involving hubs are generally considerably larger than 
those of non-hub ones. This observation is responsible for the fact that while the 
number of edges increases, the new edges do not substantially boost the degree of 
the hubs. 

The normalization constant for discrete power-law distributions is 1/ζ(γ, k_min) 4 *,
where ζ(s, a) stands for the Hurwitz zeta function,

$$ \zeta(s, a) = \sum_{n=0}^{\infty} \frac{1}{(n + a)^{s}} . \tag{1} $$
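
The fitting strategy described in this section (scan candidate k_min values, estimate the scaling index for each tail, keep the fit with the smallest KS distance) can be sketched in a few lines. This is an illustration under explicit assumptions, not the code used in the study: the scaling index is obtained from the widely used closed-form approximation γ ≈ 1 + n / Σ ln[k_i / (k_min - 1/2)] to the discrete maximum-likelihood estimate instead of a numerical maximization, the Hurwitz zeta function is evaluated by a truncated sum with an Euler-Maclaurin tail correction, the p-value step is omitted, and all function names are hypothetical.

```python
import math


def hurwitz_zeta(s, a, terms=1000):
    """Approximate zeta(s, a) = sum_{n>=0} (n + a)**(-s) for s > 1, a > 0."""
    head = sum((n + a) ** (-s) for n in range(terms))
    x = terms + a
    # Euler-Maclaurin tail: integral term plus half of the first omitted term.
    return head + x ** (1 - s) / (s - 1) + 0.5 * x ** (-s)


def power_law_cdf(k, gamma, k_min):
    """P(K <= k) for the discrete power law supported on k >= k_min."""
    return 1.0 - hurwitz_zeta(gamma, k + 1) / hurwitz_zeta(gamma, k_min)


def ks_distance(tail, gamma, k_min):
    """Largest gap between the empirical and the model cumulative distributions."""
    n = len(tail)
    return max(abs(sum(x <= k for x in tail) / n - power_law_cdf(k, gamma, k_min))
               for k in set(tail))


def fit_power_law(degrees):
    """Return (gamma, k_min, ks) for the k_min that minimizes the KS distance."""
    best = None
    for k_min in sorted(set(degrees)):
        tail = [k for k in degrees if k >= k_min]
        if len(set(tail)) < 2:
            break  # a single distinct value cannot constrain the exponent
        log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
        gamma = 1.0 + len(tail) / log_sum  # approximate discrete MLE
        ks = ks_distance(tail, gamma, k_min)
        if best is None or ks < best[2]:
            best = (gamma, k_min, ks)
    return best


# Hypothetical degree sample (positive integers), not taken from any SN.
toy_degrees = [1] * 50 + [2] * 20 + [3] * 10 + [4] * 5 + [8] * 2 + [16]
toy_fit = fit_power_law(toy_degrees)
```

Very short tails can yield deceptively small KS distances, so a practical implementation would presumably also bound k_min from above or require a minimum tail size.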



We note that we cannot model the empirical degree distribution of the current
measured SN, Aₘ, with a power-law distribution. The same algorithm as above leads
us to a scaling index of 2.66, choosing 16 as the optimal k_min. However, the KS test
gives a p value of 0.02; thus, we must reject the hypothesis that the dataset was drawn
from a power-law distribution.
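The s-metric is defined by

$$ s(G) = \sum_{(i,j) \in T} d_i d_j , \tag{2} $$

where d_i is the degree of node i and the sum runs over the edges (i, j) of the graph. If we introduce
s_max, the largest value of s among graphs G' having the same degree sequence as G,

$$ s_{\max} = \max_{G'} s(G') , \tag{3} $$

we can define the normalized s-metric used in the text as

$$ S(G) = s(G) / s_{\max} . \tag{4} $$

The graph assortativity, r(G), is defined by the Pearson coefficient,

$$ r(G) = \frac{ l^{-1} \sum_{(i,j) \in T} d_i d_j - \left[ l^{-1} \sum_{(i,j) \in T} \tfrac{1}{2} (d_i + d_j) \right]^{2} }
               { l^{-1} \sum_{(i,j) \in T} \tfrac{1}{2} (d_i^{2} + d_j^{2}) - \left[ l^{-1} \sum_{(i,j) \in T} \tfrac{1}{2} (d_i + d_j) \right]^{2} } , \tag{5} $$

where l is the number of edges in the graph.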
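
For a graph given as an edge list, the unnormalized s-metric (the sum of d_i d_j over all edges) and the degree assortativity in Newman's Pearson form can be evaluated directly. The sketch below assumes a simple undirected graph; the function names are hypothetical, and the normalization by s_max is deliberately left out because it requires constructing the extremal graph with the same degree sequence.

```python
def degree_map(edges):
    """Degree of every node that appears in the edge list."""
    deg = {}
    for i, j in edges:
        deg[i] = deg.get(i, 0) + 1
        deg[j] = deg.get(j, 0) + 1
    return deg


def s_metric(edges):
    """Unnormalized s-metric: sum over edges of the product of end-point degrees."""
    deg = degree_map(edges)
    return sum(deg[i] * deg[j] for i, j in edges)


def assortativity(edges):
    """Pearson degree correlation over edges (Newman's form)."""
    deg = degree_map(edges)
    l = len(edges)
    mean_product = sum(deg[i] * deg[j] for i, j in edges) / l
    mean_degree = sum(0.5 * (deg[i] + deg[j]) for i, j in edges) / l
    mean_square = sum(0.5 * (deg[i] ** 2 + deg[j] ** 2) for i, j in edges) / l
    denominator = mean_square - mean_degree ** 2
    if denominator == 0:
        raise ValueError("assortativity is undefined for regular graphs")
    return (mean_product - mean_degree ** 2) / denominator


# Hypothetical toy graphs: a star (one hub, three leaves) and a 4-node path.
star = [(0, 1), (0, 2), (0, 3)]
path = [(0, 1), (1, 2), (2, 3)]
```

A star is the extreme disassortative case, with r = -1: its single hub connects only to degree-one nodes.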

To build a minimum-weight spanning tree from the SNs, we implemented
Kruskal's algorithm 45. For the weight function, the negative logarithm of the
intensities on the edges was used. Admittedly, a more accurate result can be achieved
by multiplying the base intensity values by -1 to obtain a weight function.
Nevertheless, the differences are within the same order of magnitude and are negligible
for practical considerations; therefore, we believe the weight function
employed is adequate.
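
A compact sketch of Kruskal's algorithm with the negative-logarithm weight function is given below for orientation. It is a generic textbook version with a union-find structure, not the implementation used in this work; the function name, the record layout (upper, lower, intensity), and the toy intensities are assumptions, and intensities must be strictly positive for the logarithm to exist.

```python
import math


def minimum_weight_spanning_forest(levels, transitions):
    """Kruskal's algorithm with weight = -log(intensity).

    Returns one tree per connected component, i.e. a spanning forest.
    """
    parent = {level: level for level in levels}

    def find(x):
        # Follow parent pointers to the set representative, halving paths.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = []
    # Smallest weight first, which means the most intense transition first.
    for upper, lower, intensity in sorted(transitions, key=lambda t: -math.log(t[2])):
        root_upper, root_lower = find(upper), find(lower)
        if root_upper != root_lower:  # the link joins two separate trees
            parent[root_upper] = root_lower
            forest.append((upper, lower, intensity))
    return forest


# Hypothetical toy network of four levels and five transitions.
toy_levels = ["a", "b", "c", "d"]
toy_transitions = [
    ("a", "b", 1e-20),
    ("b", "c", 1e-22),
    ("a", "c", 1e-21),
    ("c", "d", 1e-25),
    ("b", "d", 1e-24),
]
toy_forest = minimum_weight_spanning_forest(toy_levels, toy_transitions)
```

The forest keeps three of the five toy transitions, one fewer than the number of levels, and discards the two weakest links that would only close cycles.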